Social Web Meets Sensor Web: From User-Generated Content to Linked Crowdsourced Observation Data

Authors

  • Dong-Po Deng
  • Guan-Shuo Mai
  • Tyng-Ruey Chuang
  • Rob Lemmens
  • Kwang-Tsao Shao
Abstract

The reach of dominant social media such as Facebook and Twitter in the current population is enormous, and these media have long been leveraged for diverse applications. In particular, for some citizen science projects, existing social media are increasingly becoming platforms on which participants interact and contribute. These user contributions, often termed User-Generated Content (UGC), can be a mixed bag of posts, comments, images, and other media. We report in this paper a work in progress on formalizing user contributions from a large Facebook group (more than 4,000 users) established for biodiversity observation. A major part of our work is to extract structured datasets with well-defined semantics from unstructured UGC collections. We use common vocabularies from Darwin Core (DwC), Friend-of-a-Friend (FOAF), Semantically-Interlinked Online Communities (SIOC), and Semantic Sensor Network (SSN), among others, to formalize the extracted datasets and hence make them readily linkable. A nice consequence of this approach is that a multi-faceted browser can be quickly built to explore biodiversity information in large collections of UGC.

Linked Data on the Web (LDOW2014), April 8, 2014, Seoul, Korea.

This research is supported in part by the Ministry of Science and Technology (grant no. 102-2627-M-001-009) and by the Endemic Species Research Institute, Council of Agriculture, Taiwan. We are grateful to Te-En Lin and his group at the Endemic Species Research Institute for their help with the collected data.

Dong-Po Deng is also a PhD candidate at the Faculty of Geo-Information Science and Earth Observation (ITC), University of Twente.

Tyng-Ruey Chuang is also affiliated with the Research Center for Information Technology Innovation and the Research Center for Humanities and Social Sciences (Center for Geographic Information Science), both at Academia Sinica.

This paper is released under the Creative Commons Attribution 4.0 License. You are free to share and adapt this paper for any purpose, even commercially, as long as you give appropriate credit, provide a link to the license, and indicate if changes were made. These freedoms cannot be revoked as long as you follow the license terms.


Similar resources

Managing Quality of Crowdsourced Data

The Web is the central medium for discovering knowledge via various sources such as blogs, social media, and wikis. It facilitates access to contents provided by a large number of users, regardless of their geographical locations or cultural backgrounds. Such user-generated content is often referred to as crowdsourced data, which provides informational benefit in terms of variety and scale. Yet...


Developing Knowledge Models of Social Media: A Case Study on LinkedIn

User-Generated Content (UGC) exchanged [1] via large social networks is considered a very important knowledge source about all aspects of social engagement (e.g., interests, events, personal information, personal preferences, social experience, skills, etc.). However, this data is inherently unstructured or semi-structured. In this paper, we describe the results of a case study on LinkedIn Ire...


A Web Mashup for Social Libraries

User-generated content on the Social Web is often locked within information silos. Inadequate APIs or, worse, the lack of APIs obstruct reuse and prevent the opportunity to integrate similar content from different communities. In this paper we present a Web mashup which combines information from different social libraries. Aggregated information, including both classic book metadata and user-ge...


Data Extraction using Content-Based Handles

In this paper, we present an approach and a visual tool, called HWrap (Handle Based Wrapper), for creating web wrappers to extract data records from web pages. In our approach, we mainly rely on the visible page content to identify data regions on a web page. Our extraction algorithm is inspired by the way a human user scans the page content for specific data. In particular, we use text fea...


Special issue on real-time and ubiquitous social semantics

The Web has shifted from its initial document and librarian paradigm to an ecology of socially-generated data and services. Websites such as Twitter, Facebook, and FourSquare, emphasise the huge popularity of sharing information in real-time. In addition, the wealth and breadth of applications that exploit open social networking APIs to provide new services and functionalities are growing rapid...




Publication date: 2014